Abstract: In this paper, we present an adaptive fault-tolerant event detection scheme for
wireless sensor networks. Each sensor node detects an event locally in a distributed manner by
using the sensor readings of its neighboring nodes. Confidence levels of sensor nodes are used
to dynamically adjust the threshold for decision making, resulting in consistent performance
even with an increasing number of faulty nodes. In addition, the scheme employs a moving
average filter to tolerate most transient faults in sensor readings, reducing the effective fault
probability. Only three bits of data are exchanged to reduce the communication overhead in
detecting events. Simulation results show that high event detection accuracy and a low false
alarm rate are maintained even in the case where 50% of the sensor nodes
are faulty.
1. Introduction

Wireless sensor networks often consist of a large number of small sensor nodes that cooperate to 
monitor real-world events and enable applications such as target tracking, military tactical surveillance, 
and emergency health care [1]. The detection and reporting of the occurrence of an interesting event 
is one of the important tasks of sensor networks. Due to limitations in available resources, such as 
power, memory and computing capability, sensor nodes deployed in a harsh environment, operating in 
an unattended mode, are prone to failure. Faulty nodes might issue an alarm even though they are not in
an event region. They degrade the network reliability, unless some provisions are made to tolerate them. 

Several distributed schemes for detecting events in the presence of faulty sensor nodes have been 
proposed in [2-5]. Krishnamachari and Iyengar [2] have mathematically proven that majority voting
is an optimal decision rule for the given model to detect events and correct faults. A single binary variable
is used to represent a local event detection, resulting in low communication cost. Their simulation
results show that 85 to 95% of faults can be corrected when the fault rate is about 10%. Luo et al. [3]
proposed a fault-tolerant energy-efficient event detection paradigm for wireless sensor networks. For 
a given detection error bound, minimum neighbors are selected to minimize the communication volume. 
Both Bayesian and Neyman-Pearson detection methods are presented. A localized event boundary 
detection scheme, exploiting the notion that readings from the event region and the normal region have 
different means but the same standard deviation due to noise, has been proposed in [4]. Actual sensor 
readings, encoded in 32 bits each, are transmitted and used in making a decision. The corresponding 
estimation may be more precise at the cost of increased communication overhead. Jin et al. [5] have 
employed a variable length event coding mechanism in event and event boundary detection to balance 
the communication cost and the estimation quality. Sensor nodes near the event boundary send the
original sensor readings of 32 bits (with a 1-bit flag), whereas all other nodes send only a two-bit
message instead.

In [6], a fault-tolerant event boundary detection algorithm using a clustering technique based on 
maximum spanning trees is presented. Difference in sensor readings between any two sensor nodes 
is represented as the distance between them. Using these distances, sensor nodes are classified into two
clusters. With some additional computation on the clusters, event boundary nodes are determined. 

Most of the proposed event detection schemes based on a statistical model of noise may work 
effectively for a relatively low fault probability. As the fault probability increases, however, their 
performance degrades considerably. Moreover, the actual performance might differ significantly from 
the estimated one if faults behave differently from the model. 

In this paper, we present a distributed adaptive fault-tolerant event detection scheme for wireless 
sensor networks. It achieves high performance for a wide range of fault probabilities by employing 
a filter for tolerating transient faults and by dynamically adjusting the threshold for event detection 
depending on the fault status of sensor nodes. Confidence levels are used to manage the status of sensor 
nodes. Sensor nodes with a permanent fault (or behaving incorrectly for an extended period of time) are 
isolated from the network and reinstated later if some required conditions on confidence levels are met. 
Due to the adaptability of the proposed scheme both high event detection accuracy and low false alarm 
rate can be maintained even with increasing number of faults. 

The remainder of the paper is organized as follows. In Section 2, the system model and fault model 
are briefly described. Section 3 presents our adaptive event detection scheme employing a dynamic 
threshold selection. Filtering transient faults is also proposed to reduce the effective fault probability of 
sensor nodes. Simulation results are shown in Section 4. Conclusions are made in Section 5. 
2. System Model and Fault Model

As the system model we assume that sensor nodes are randomly deployed in the target area and all 
sensor nodes have the same transmission range r. Each sensor node receives the sensor readings of 
neighboring nodes and makes a decision on an event locally in a distributed manner. We define the 
average node degree d to represent the connectivity of the network. For convenience, an event region
is assumed to be a circle with radius l. The proposed adaptive scheme, however, is expected to perform well even
with different event region shapes. Each sensor node is assumed to know the range of normal sensor
readings, and thus can decide on its own whether the sensed data lies in the range of normal
readings and report a 1 (abnormal) or 0 (normal) accordingly. Clearly, a faulty sensor or an
event may produce abnormal data, and thus they are indistinguishable based on the readings of a single 
sensor node. All the sensor readings are assumed to be binary, without loss of generality. In the case of 
arbitrary values, comparison diagnosis presented in [7, 8] may be used instead. 

Three different types of faults in sensor readings, depending on their temporal behavior, are 
considered in this paper: permanent, transient, and intermittent [9-11]. In the case of a permanent 
fault, we assume that it causes an incorrect reading, either 1 or 0, consistently, with the same probability 
of 0.5, irrespective of the region it is in. Transient faults are assumed to be independent both spatially 
and temporally. A special type of intermittent fault which generates erroneous data periodically is also 
taken into account to estimate the adaptability of the proposed scheme. Although we focus on faulty 
sensors in this paper, the proposed scheme can possibly be extended to cover faulty communications 
with some degradation in performance by modeling faults in communication as sensor faults in the 
associated sensor nodes. 
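
The fault model above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, a permanent fault is modeled as a stuck-at value chosen with probability 0.5, and a transient fault is modeled as a flipped binary reading, which the text does not state explicitly.

```python
import random

def assign_permanent_fault(p_p, rng):
    """Return the stuck value (0 or 1) of a permanently faulty node, or None for a good node."""
    if rng.random() < p_p:
        # A permanent fault reports 1 or 0 consistently, each with probability 0.5.
        return rng.choice([0, 1])
    return None

def sensor_reading(true_value, stuck_value, p_t, rng):
    """Binary reading of one node in one cycle (true_value is 1 inside an event region)."""
    if stuck_value is not None:
        return stuck_value            # permanent fault: independent of the region
    if rng.random() < p_t:
        return 1 - true_value         # transient fault (assumed here to flip the reading)
    return true_value
```

Each call to `sensor_reading` draws a fresh random number, so transient faults are independent across nodes and across time, as assumed in the fault model.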

Sensor networks are assumed to conduct fault detection periodically to manage fault status of sensor 
nodes. The period, however, is expected to be long enough to reduce the overhead incurred. Nevertheless,
the event detection performance can be kept very high as long as most of the faulty sensor
nodes are identified and isolated.

3. Adaptive Event Detection Scheme 

In this section, we first describe the confidence levels of sensor nodes to be used in the proposed 
event detection scheme. We then present our adaptive event detection scheme using the confidence 
levels defined. Some erroneous readings due to transient faults will be corrected by employing a moving 
average filter to further enhance event detection performance. For convenience we list the notation to be 
used in this paper. 
Notation

v_i       sensor node
x_i^k     sensor reading at node v_i at time k
y_i^k     filtered output of the input x_i^k (to tolerate most transient faults)
R_i       threshold test result at v_i based on x_i and the x_j's (i.e., neighbors')
H_i       threshold test result at v_i based on y_i and the y_j's
D_i       final decision on an event at v_i
F_i       fault status of v_i (good, faulty)
F_ij      fault status of v_j from the viewpoint of v_i (good, faulty)
d_i       node degree of v_i
d_i^k     effective node degree of v_i at time k (i.e., number of neighboring nodes with F_ij = 0)
l         radius of an event region
r         transmission range
d         average node degree of a sensor network (i.e., d = (1/N) Σ_i d_i)
d^k       average effective node degree of a sensor network at time k (i.e., d^k = (1/N) Σ_i d_i^k)
M         window size for tolerating transient faults
δ         threshold for filtering transient faults
c_i       self confidence level of v_i
w_ij      confidence level of v_j from the viewpoint of v_i
p_p       permanent fault probability
p_t       transient fault probability
θ         threshold for event detection


3.1. Confidence Levels 

In order to describe confidence levels of a sensor node and its neighbors, a sensor network is modeled
here as a weighted directed graph, G(V, E), where V represents the set of sensor nodes and E represents
the set of edges connecting sensor nodes. Two nodes v_i and v_j are said to be connected if the distance
between them, dist(v_i, v_j), is less than or equal to r (the transmission range). Each node v_i is assigned a
self-confidence level c_i. Each edge is also assigned a weight w_ij, indicating the confidence level of
v_j from the viewpoint of v_i. The confidence levels will be used to isolate potentially faulty sensor nodes
from the rest of the network. They are also used to reinstate an isolated node if the confidence levels
associated with it satisfy the required conditions to be addressed shortly. We use c_min and c_max to denote
the range of the confidence level c_i. Similarly, w_min and w_max will be used to indicate the range of w_ij.

An illustration is given in Figure 1, where six nodes are neighbors of the node v_3 (i.e., six nodes are
located within the communication range of v_3) and confidence levels c_i and w_ij are assumed to be in
the range of 0 to 1. In the figure, from the viewpoint of node v_3, v_2 and v_4 are the nodes with the highest
confidence while v_5 is the node with the lowest confidence. Among the six neighboring nodes of v_3, v_5 is
the most likely to be faulty, and will be ignored by v_3 if w_min = 0.

Figure 1. An illustration of confidence levels.




The confidence levels will be updated each time a fault detection or event detection is performed. All
c_i and w_ij are initialized to 1 (i.e., c_max and w_max). They are increased or decreased by α (0 < α < 1)
when the required conditions to be explained later are met.
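
As a minimal sketch of the graph model described in this subsection, the following Python function builds the directed edges and initializes all confidence levels to their maximum. The function name, the list/dictionary representation, and the use of 2-D coordinates are assumptions made for illustration only.

```python
import math

def build_confidence_graph(positions, r, c_max=1.0, w_max=1.0):
    """Return (c, w): self-confidence per node and edge confidence per ordered pair.

    positions: list of (x, y) coordinates; r: transmission range.
    w[(i, j)] holds the confidence level of v_j from the viewpoint of v_i.
    """
    n = len(positions)
    c = [c_max] * n                      # every c_i starts at c_max
    w = {}
    for i in range(n):
        for j in range(n):
            # v_i and v_j are connected if dist(v_i, v_j) <= r
            if i != j and math.dist(positions[i], positions[j]) <= r:
                w[(i, j)] = w_max        # every w_ij starts at w_max
    return c, w
```

Keeping `w[(i, j)]` and `w[(j, i)]` as separate entries reflects the directed nature of the graph: the two nodes may come to hold different opinions of each other.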

3.2. Filtering Transient Faults 

Event detection performance will degrade as the fault probability p increases. Hence reducing the 
effective p is desirable to make an event detection scheme robust to faults occurring in sensor networks. 
In order to do that, we use the confidence levels defined above to isolate faulty nodes and employ a 
modified moving average filter, to be discussed here, to correct some erroneous sensor readings due to 
transient faults. 

Let x_i^k represent the sensor reading at node v_i at time k. Then the filter we employ takes an average of
the last M readings, x_i^k, ..., x_i^{k-M+1}, and sets the output to 1 if it passes a given threshold δ.

Hence the output y_i^k (i.e., the filtered output at node v_i) can be expressed as follows:

\[
y_i^k = \begin{cases} 1 & \text{if } \frac{1}{M} \sum_{j=k-M+1}^{k} x_i^j \ge \delta \\ 0 & \text{otherwise.} \end{cases} \tag{1}
\]
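
The filter can be sketched as follows. This is an illustrative implementation, not the authors' code: the class name is hypothetical, the threshold comparison is assumed to be inclusive, and readings before the first cycle are treated as 0.

```python
from collections import deque

class TransientFaultFilter:
    """Moving average filter over the last M binary readings of one node."""

    def __init__(self, window=4, delta=0.75):
        self.window = window                    # M
        self.delta = delta                      # threshold on the average
        self.history = deque(maxlen=window)     # keeps only the last M readings

    def update(self, x):
        """Feed the reading of the current cycle and return the filtered output."""
        self.history.append(x)
        # Divide by M (not by the number of stored readings) so that
        # missing early readings count as 0.
        average = sum(self.history) / self.window
        return 1 if average >= self.delta else 0
```

With M = 4 and a threshold of 0.75, a sustained run of 1s starting at some cycle produces an output of 1 two cycles later, while isolated 1s never do.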

Parameters, M (i.e., window size) and 5 (threshold) need to be properly chosen, depending on 
applications, for the best performance. They can be dynamically adjusted to enhance adaptability. 
As long as most of erroneous readings due to transient faults can be corrected, however, a high event 
detection performance can be obtained as will be shown in the simulation results in Section 4. Due 
to the fact that an event may cause abnormal sensor readings for an extended period of time, most 
transient faults can be filtered unless they occur repeatedly within the window. Although the types of 
faults may differ depending on applications, most random transient faults can be corrected even with a 
small window size. The resulting reduction in effective fault probability can positively affect event
detection performance.
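
As a rough illustration of this reduction, consider a good node outside any event region whose readings are flipped to 1 independently with probability p_t. Assuming M = 4, a threshold of 0.75, and an inclusive comparison (so that at least three of the last four readings must be 1), the probability of a wrong filtered output is:

```latex
\[
P(\text{filtered output} = 1)
  = \binom{4}{3} p_t^{3} (1 - p_t) + p_t^{4}
\]
\[
p_t = 0.1:\quad 4 (0.001)(0.9) + 0.0001 = 0.0037,
\qquad
p_t = 0.2:\quad 4 (0.008)(0.8) + 0.0016 = 0.0272
\]
```

Under these assumptions the effective transient fault probability drops from 0.1 to below 0.004. This calculation is a sketch for intuition and is not taken from the simulation results.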

Table 1 shows how erroneous readings due to some transient faults are corrected when M = 4 and δ =
0.75. For i = 1, the filter at node v_1 will generate 0's even though two of its readings x_1^k are 1. In the case of i = 5, where
an event occurs at time 1 and v_5 is assumed to be in the event region, the output becomes 1 with a delay
of two cycles. That is, y_5^3 becomes 1.

Table 1. An illustration of filtering transient faults when M = 4 and δ = 0.75.

Both the x_j's and the y_j's will be used in event detection as shown in Figure 2, where two identical blocks
are employed to perform threshold tests (to be addressed shortly) with the x_j's and the y_j's, respectively. The
resulting binary decisions, R_i and H_i, will be given to the subsequent decision block to make a final
decision D_i on an event.



Figure 2. Proposed event detection scheme.



In the majority voting in [2], as in most other schemes, only the upper left threshold test block is employed,
although the block could be functionally different. In our proposed event detection scheme
both R_i and H_i are used. The final decision D_i on an event will be made based on H_i, while R_i is used
as a warning of an event.

3.3. Dynamic Threshold Selection 

In this subsection, we present our adaptive event detection scheme, focusing on the threshold test
block in Figure 2, where the confidence levels introduced in Section 3.1 will be used to
dynamically adjust the threshold for event detection. The confidence levels, updated each time event
detection/fault detection is performed, are utilized to isolate potentially faulty sensor nodes and reinstate
them if some given conditions are met. The resulting changes are to be reflected in the number of
neighboring nodes (i.e., the effective node degree d_i^k at time k) of each node v_i, and it will in turn modify
the threshold θ for the next event detection cycle. In order to realize this adaptivity, each sensor node v_i
holds its fault status F_i, its self-confidence level c_i, the confidence levels of its neighboring nodes w_ij,
and the fault status of node v_j from the viewpoint of v_i, F_ij.

The proposed event detection scheme, where the threshold θ is dynamically adjusted depending on
the effective node degree, can be described as follows. Majority voting is used in the threshold test. F_i
and F_ij are initialized to 0 (good).



Adaptive Event Detection Scheme

1. Obtain sensor reading x_i^k and filter it to get y_i^k

2. Obtain sensor readings x_j, filtered outputs y_j, and F_j from neighbors

3. Set the threshold θ to d_i^k / 2

4. Determine b_i, the number of neighbors with x_j = x_i
   Determine g_i, the number of neighbors with y_j = y_i

5. If g_i > θ, then H_i ← y_i, else H_i ← ¬y_i
   If b_i > θ, then R_i ← x_i, else R_i ← ¬x_i

6. Report an event (i.e., D_i = 1) if H_i = 1
   Report a warning if R_i = 1

7. Update the confidence levels c_i and w_ij



In steps 1 and 2, each sensor node receives its own and neighbors' sensor readings (including filtered 
ones). Steps 3 to 5 are functions to be performed in the two threshold test blocks in Figure 2. In step 
3, the threshold value for majority voting to be used in step 5 is determined. Step 5 will set Ri (Hi) to 
either 0 or 1 depending on the number of matching neighbors obtained in step 4. Ri and Hi at node Vi 
can be set against its own readings if the node fails to pass the threshold. In step 6, the decision on an 
event will be made. Ri = 1 will be taken as a warning since it might occur due to transient faults. If it is 
an indication of an event, the decision on an event will be made at the time Hi becomes 1. The warning
must be given to its neighboring nodes to shorten the cycle time momentarily so that an event can be 
reported quickly. Confidence levels are updated in step 7. The confidence level of Vj from the viewpoint 
of Vi, Wij, is updated according to Table 2. 
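
Steps 3 to 6 at a single node can be sketched as below. The function names are hypothetical, the neighbor list is assumed to contain only nodes in the effective neighbor list, and the strict comparison follows the steps as listed.

```python
def threshold_test(own_value, neighbor_values):
    """Majority vote of one node against its effective neighbors (steps 3 to 5)."""
    theta = len(neighbor_values) / 2                               # step 3: half the effective degree
    matches = sum(1 for v in neighbor_values if v == own_value)    # step 4
    # Step 5: keep the node's own value if enough neighbors agree, else negate it.
    return own_value if matches > theta else 1 - own_value

def detect_event(x_i, y_i, neighbors):
    """neighbors: list of (x_j, y_j) pairs. Returns (D_i, warning)."""
    r_i = threshold_test(x_i, [x for x, _ in neighbors])   # test on raw readings
    h_i = threshold_test(y_i, [y for _, y in neighbors])   # test on filtered outputs
    d_i = h_i                                              # step 6: event decision follows H_i
    warning = (r_i == 1)                                   # step 6: R_i = 1 is only a warning
    return d_i, warning
```

For example, `detect_event(1, 0, [(1, 0), (1, 0), (0, 0)])` returns `(0, True)`: the raw readings agree on a 1 and raise a warning, but the filtered outputs have not yet confirmed an event.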

As shown in Table 2, w_ij is increased by α only when F_j = 0 (good) and D_i = y_j. In other words, the
confidence level of v_j from the viewpoint of v_i becomes higher when both v_i and v_j have similar sensor
readings and v_j is currently in the good state. The second and fourth rows decrease w_ij by α since
F_j = 1 (faulty).
Table 2. Updating w_ij at node v_i.

D_i = y_j    F_j           w_ij
yes          0 (good)      up
yes          1 (faulty)    down
no           0 (good)      down for D_i = 0
no           1 (faulty)    down



The third row can be explained using the following three representative cases among others. It lowers
the confidence level of its neighboring node v_j only when D_i is equal to 0.

Case 1: Suppose that two good nodes v_i and v_j are neighboring each other and each of them is
surrounded by a sufficient number of good nodes to pass the threshold test. The first case occurs when v_j
becomes faulty and sends a 1 as shown in Figure 3. In this case, v_i will have D_i = 0, y_j = 1, and F_j = 0
(until v_j sets F_j to 1). Hence the conditions are met. The desired action at node v_i, as far as the confidence
level is concerned, is to lower the confidence level of v_j (i.e., w_ij).

Figure 3. Case 1 for the third row in Table 2. 




Case 2: The conditions can also be met when two good nodes, v_i and v_j, neighboring each other are
located in such a way that only one of them is in the event region, as illustrated in Figure 4. In the figure,
v_i is in the event region and receives a 1 from v_1 through v_4 and will eventually report an event (i.e.,
D_i = 1). Meanwhile, v_j also makes the right decision of no-event (i.e., D_j = 0). When y_i = 1 and
y_j = 0, as expected, v_i will have D_i = 1, y_j = 0, and F_j = 0, satisfying the conditions. The conditions are
also met for v_j since D_j = 0, y_i = 1, and F_i = 0. The correct action in case 2, as far as the confidence level
is concerned, is as follows: (a) at node v_i, w_ij needs to be increased, (b) at node v_j, w_ji also needs to
be increased.

Case 3: It occurs when faulty nodes in close proximity, claiming to be good, are in an event region as
shown in Figure 5 such that their readings are 0 as opposed to 1 (abnormal). Suppose that two nodes in
the event region, v_i and v_j, are neighboring each other and v_j is one of the faulty nodes. v_j
may have D_j = 0 because v_6 and v_7, being outside the event region, are likely to report a 0. Both v_i
and v_j meet the conditions. The proper actions in this case are (a) at node v_i, where D_i = 1, y_j = 0, and
F_j = 0, w_ij has to be lowered, (b) at node v_j, where D_j = 0, y_i = 1, and F_i = 0, w_ji needs to be increased
to eventually change F_j to 1.

Figure 4. Case 2 for the third row in Table 2.




Figure 5. Case 3 for the third row in Table 2. 




For node Vi the above cases can be divided into two groups, depending on the value of Di. The first 
group (Di = 0) includes case 1, case 2(b), and case 3(b). Although the three cases in the first group cannot 
be distinguished based on the given information, the desired actions may differ. Only case 1 wants to 
lower the confidence level. The second group (Di = 1) includes case 2(a) and case 3(a), requesting 
conflicting actions. The third row in the table allows only case 1 to update the confidence level, ignoring 
all other cases. The reasons for taking this action are as follows. Confidence levels are maintained to
isolate nodes with permanent faults or nodes behaving incorrectly for some extended period of time.
Hence it is primarily intended to handle case 1. All other cases are related to events, which in general 
consume a relatively small portion of the entire monitoring time. In the case of an event, due to the 
conflicting requests, correctly updating confidence levels needs some additional information on the exact 
boundary of the event region, requiring more sophisticated computations. Hence momentarily stopping 
the updates in the case of an event may be appropriate since the network continues its monitoring function 
with most of the faulty nodes isolated. 
Based on Table 2 the confidence level w_ij is updated as follows.

\[
w_{ij} = \begin{cases}
\max(w_{\min}, w_{ij} - \alpha) & \text{if } (D_i \neq y_j \text{ and } D_i = 0) \text{ or } F_j = 1 \\
\min(w_{\max}, w_{ij} + \alpha) & \text{if } D_i = y_j \text{ and } F_j = 0 \\
w_{ij} & \text{otherwise}
\end{cases} \tag{2}
\]



It is increased or decreased by α each time the conditions are met. The value of α needs to be chosen
depending on the types of faults and applications. If α is relatively small, a node with transient faults is
highly unlikely to be removed from the neighbor list. As α increases, however, it can be removed with
an increased probability. Even if it is isolated, the node with only transient faults will be reinstated in
our adaptive scheme.
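
The update rule of Table 2 can be written as a small function. This is a sketch for illustration: the function name and default parameter values (step 0.2, range 0 to 1) are assumptions, chosen to match values mentioned elsewhere in the paper.

```python
def update_w(w_ij, d_i, y_j, f_j, alpha=0.2, w_min=0.0, w_max=1.0):
    """Return the new confidence level of v_j from the viewpoint of v_i.

    d_i: final decision at v_i; y_j: filtered output of v_j;
    f_j: fault status reported by v_j (0 good, 1 faulty).
    """
    if f_j == 1 or (d_i != y_j and d_i == 0):
        # Rows 2 and 4 (neighbor declares itself faulty) and row 3 with D_i = 0.
        return max(w_min, w_ij - alpha)
    if d_i == y_j and f_j == 0:
        # Row 1: a good neighbor that agrees with the decision.
        return min(w_max, w_ij + alpha)
    # Row 3 with D_i = 1: disagreement during an event leaves w_ij unchanged.
    return w_ij
```

The last branch is the case the text singles out: when v_i has decided on an event and a good neighbor disagrees, the available information cannot tell an event boundary from a fault, so no update is made.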

A potentially faulty neighboring node v_j of node v_i will be removed from the effective neighbor list
of v_i as follows. If F_ij = 0 (good) and w_ij = w_min, F_ij is set to 1 (faulty) and v_j is removed from v_i's
effective neighbor list. On the other hand, if F_ij = 1 (faulty) and w_ij = w_max, F_ij will be set to 0 (good)
and v_j will rejoin v_i's effective neighbor list. Once a node is removed from the list (i.e., w_ij = w_min),
it can rejoin the list only when w_ij is increased and reaches w_max. Similarly, once a removed node rejoins
the effective neighbor list, it will remain there unless w_ij reaches w_min again.
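
The removal and reinstatement rule is a hysteresis between the two ends of the confidence range. A minimal sketch follows; the function names are hypothetical, and the comparisons use `<=` and `>=` instead of equality only to be robust to floating-point rounding.

```python
def update_neighbor_status(f_ij, w_ij, w_min=0.0, w_max=1.0):
    """Return the new fault status of v_j from the viewpoint of v_i (0 good, 1 faulty)."""
    if f_ij == 0 and w_ij <= w_min:
        return 1      # confidence hit the floor: remove v_j from the effective list
    if f_ij == 1 and w_ij >= w_max:
        return 0      # confidence fully restored: v_j rejoins the effective list
    return f_ij       # in between, the status keeps its previous value

def effective_neighbors(status):
    """status: dict mapping neighbor id to F_ij. Returns the ids still trusted."""
    return [j for j, f in status.items() if f == 0]
```

Because the status changes only at the two extremes, a neighbor whose confidence hovers around a middle value is neither removed nor reinstated repeatedly.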

Similarly the self-confidence level of v_i, c_i, is also updated in step 7. It is lowered if the decision
made at v_i, D_i, is different from its own filtered sensor reading, y_i, except for an event.

Fault status F_i changes depending on the self-confidence level c_i. F_i will be set to 1 (faulty) when c_i
becomes c_min. Once it is set to 1, it will stay there until c_i reaches c_max again.

In the case where a good sensor node has more faulty neighbors than good ones, the node might be determined to
be faulty, as illustrated in Figure 6, where c_i for v_i will be lowered due to the inequality D_i ≠ y_i. It,
however, will highly likely be determined to be a good node with time. The node v_3, a neighbor of v_i,
will determine itself to be faulty if it cannot pass the threshold such that its confidence level c_3 reaches
0. In the figure, v_3 has more good neighbors than faulty ones. Hence D_3 is highly unlikely to be y_3.
Once F_3 is set to 1, v_i will remove v_3 from its neighbor list. As a result, its effective node degree d_i^k
will be lowered. If this also happens at v_4, for example, the node is also removed from the list, and the
node degree of v_i is further lowered. Finally, v_i passes the threshold, changes its fault status to 0 (good)
some cycles later, and it can then be treated as a good node. If a larger number of faulty nodes are in
close proximity, this recovery might not happen. The case, however, is extremely unlikely since our
adaptive scheme removes faulty nodes as soon as they are identified. Unless all the nodes become faulty almost
simultaneously, such a situation is unlikely to occur.
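
The recovery described above can be illustrated with a small numeric example. The neighbor counts below are hypothetical and are not taken from Figure 6; the strict comparison follows the steps listed in the scheme.

```python
def passes_threshold(own_value, neighbor_values):
    """True if strictly more than half of the effective neighbors match own_value."""
    theta = len(neighbor_values) / 2
    matches = sum(1 for v in neighbor_values if v == own_value)
    return matches > theta

# A good node reading 0 with five neighbors: three faulty ones stuck at 1, two good ones.
before_isolation = [1, 1, 1, 0, 0]
# After two of the faulty neighbors have been isolated, three effective neighbors remain.
after_isolation = [1, 0, 0]

print(passes_threshold(0, before_isolation))   # 2 matches against a threshold of 2.5
print(passes_threshold(0, after_isolation))    # 2 matches against a threshold of 1.5
```

The node's own reading never changes; it starts passing the test only because the effective node degree, and with it the threshold, has dropped.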




The self-confidence level c_i is updated in step 7 as follows.

\[
c_i = \begin{cases}
\max(c_{\min}, c_i - \alpha) & \text{if } D_i \neq y_i \\
\min(c_{\max}, c_i + \alpha) & \text{otherwise}
\end{cases} \tag{3}
\]

Figure 6. A good node failing to pass the threshold due to neighboring faulty nodes.




4. Simulation Results 

Computer simulation is conducted to evaluate the performance of the proposed event detection
scheme. Our simulated sensor network consists of 1,024 sensor nodes, randomly deployed in a
32 x 32 square region. Initially each node has about 12 neighboring nodes on average (i.e., d ≈ 12)
in the simulation. The event region is assumed to be a circle with radius l = 2r, where r is the transmission
range of each sensor node. Nodes with a permanent fault are assumed to consistently report an unusual
reading (similar to stuck-at-1) or a normal reading (similar to stuck-at-0) with the same probability of
0.5, irrespective of the regions they are in. Both permanent and transient faults are considered and their
probabilities are denoted by p_p and p_t, respectively. Hence the overall fault probability p is equal to
p_p + p_t. In filtering transient faults, M (window size) and δ (threshold) are set to 4 and 0.75, respectively.
In the simulation, three different values of α, 0.1, 0.2 and 0.3, are chosen for comparison purposes.
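
A deployment with these parameters can be sketched as follows. The transmission range is not given in the text; the value used here is an assumption derived from the node density of one node per unit area, for which an average degree of about 12 corresponds to π r² ≈ 12, ignoring border effects. Function names and the seed are also illustrative.

```python
import math
import random

def deploy(n=1024, side=32.0, seed=0):
    """Place n nodes uniformly at random in a side x side square."""
    rng = random.Random(seed)
    return [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(n)]

def average_degree(positions, r):
    """Average number of neighbors within transmission range r."""
    n = len(positions)
    links = 0
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(positions[i], positions[j]) <= r:
                links += 1
    return 2 * links / n          # each link contributes to two node degrees

# Assumed transmission range: pi * r^2 = 12 at unit density, so r is about 1.95.
r = math.sqrt(12 / math.pi)
```

Nodes near the border of the square have fewer neighbors, so the measured average degree for this range comes out somewhat below 12.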

Three metrics, DA (event detection accuracy), FAR (false alarm rate) and ERDR (event region
detection rate), are used to evaluate the performance of the proposed event detection scheme. FAR 
is defined as the ratio of the number of nodes reporting an event, in the case of no event, to the total 
number of sensor nodes. DA is the ratio of the number of times that events are detected to the total 
number of event occurrences. ERDR is the ratio of the number of nodes, in the event region, reporting 
an event (i.e., Di = 1) to the total number of nodes in the event region. Our objective is to keep high DA 
and low FAR simultaneously even when the fault probability is high. Although ERDR is not the main 
concern in this paper, statistical data for event region detection are obtained for future research. 
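
The three metrics can be computed as sketched below. The function names and input representations are assumptions; in particular, how a single event occurrence is judged to be "detected" is not specified in the text, so `detection_accuracy` simply takes one boolean per event occurrence.

```python
def false_alarm_rate(decisions):
    """FAR: fraction of all nodes reporting an event in a cycle with no event."""
    return sum(decisions) / len(decisions)

def event_region_detection_rate(decisions, in_event_region):
    """ERDR: fraction of nodes inside the event region that report an event."""
    inside_decisions = [d for d, inside in zip(decisions, in_event_region) if inside]
    return sum(inside_decisions) / len(inside_decisions)

def detection_accuracy(event_detected):
    """DA: fraction of event occurrences that were detected."""
    return sum(event_detected) / len(event_detected)
```

For instance, a single false alarm among 1,024 nodes gives a FAR of 1/1024, roughly 0.001.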

Table 3 shows DA for the proposed event detection scheme for various values of p_t when p_p is
increased by 0.01 every 20 cycles up to 0.5. The results indicate that DA can be maintained
high even with an increasing number of faults.
Table 3. DA for various values of p_p and p_t.

           p_t
p_p     0.00     ...      ...      ...      0.20
0.10    1.000    1.000    1.000    1.000    1.000
0.20    1.000    1.000    1.000    1.000    0.999
0.30    1.000    1.000    1.000    1.000    0.995
0.40    1.000    1.000    1.000    0.993    0.949
0.50    0.999    0.999    0.997    0.933    0.753



Figure 7 shows FAR with increasing permanent fault probability p_p for various values of transient
fault probability p_t when α = 0.2. To see how the proposed scheme adapts to the increase in the number
of faults, p_p is increased by 0.01 every 20 cycles. FAR is kept very close to zero even when p_p is 0.5. In
the case of p_t = 0.1 and p_p = 0.2, for example, FAR is about 0.00006. That is, on average only about 0.06 nodes out of
1,024 raise a false alarm even at a combined fault probability of 0.3. Sensor nodes with a permanent
fault (producing erroneous data repeatedly for an extended period of time) can hardly affect the decision
making process since they will be isolated from the network until they exhibit normal behavior again. In
addition, the increase in transient fault probability p_t, up to 0.2, does not cause any notable performance
degradation due to the effective filtering of transient faults.

Figure 7. FAR with increasing p_p for various values of p_t.

We have compared the performance of the proposed scheme with that of the majority voting. The
results for p_t = 0.1, p_p ranging from 0.0 to 0.5, and α = 0.2 are shown in Figure 8. Unlike the proposed scheme,
FAR for the majority voting increases with p_p, exhibiting a significant number of false alarms. These
false alarms will waste the network resources, resulting in a considerable reduction in network lifetime.
On the other hand, ERDR for our scheme is lower than that of the majority voting. The reason for this
degradation in ERDR is that correcting erroneous readings by employing a filter may reduce the number
of non-event sensor nodes incorrectly reporting a 1 (abnormal). In fact, incorrect readings due to faulty
sensor nodes near but outside an event region may positively affect the event detection.

Figure 8. Comparison between the proposed scheme and majority voting (MV) with
increasing p_p when p_t = 0.1.

A similar simulation is done to compare the performance for three different values of α: 0.1,
0.2 and 0.3. The resulting FAR and ERDR are shown in Figure 9, where the number in
parentheses represents the value of α. As can be seen, the best performance is obtained for
α = 0.1, although the performance difference between 0.1 and 0.2 is marginal. A notable degradation in
performance can be observed for α = 0.3. This stems from the fact that some good nodes are removed
from the neighbor list due to transient faults.

Figure 9. ERDR and FAR for three different values of α when p_t = 0.1.

In the proposed adaptive scheme, a sensor node v_i treats a potentially faulty sensor node v_j as a faulty
node at the time the confidence level w_ij reaches 0. The resulting reduction in the effective node degree
of each sensor node, d_i^k, will accordingly change the threshold θ to adapt to the new network topology.
Consequently faulty nodes can only affect the decision making process until they are identified and
isolated. Due to the dynamic threshold selection, high event detection performance can be maintained
even with increasing fault probability as shown in Figure 10, where p_p is increased by 0.01 every 40
cycles and an event is assumed to occur every 40 cycles. As expected, the average node degree d^k (at
time t = k) decreases and the number of false alarm nodes slowly increases with p_p. The number of false
alarm nodes moves up and down periodically due to the artificially generated periodic events.

Figure 10. Average node degree d^k and the number of false alarms when p_p increases up to
0.5 and p_t = 0.2.

Figure 11. Average node degree d^k and the number of false alarms when intermittent faults
occur simultaneously every 80 cycles with the probability of 0.2.

Another simulation is performed to show how the proposed scheme adapts to a special type of fault,
producing erroneous readings periodically for some period of time. For simplicity, each node is assumed
to have such an intermittent fault with probability of 0.2 every 80 cycles, producing incorrect readings
for 40 cycles. The results are shown in Figure 11, where the number of nodes that make a wrong decision
soars to more than 12 at the time such a fault occurs, but goes down to below 4 after a few threshold
adjustments. Once the erroneous data due to the faults disappear, the threshold goes back to the original
position, as expected.

The proposed adaptive scheme has the potential to adapt to different fault patterns. The performance 
of the scheme will further be investigated by generating various types of faults discussed in [12]. 

5. Conclusions 

In this paper, we proposed an adaptive fault-tolerant event detection scheme for wireless sensor 
networks. It maintains high performance, in terms of detection accuracy and false alarm rate, for a wide 
range of fault probabilities, by employing a dynamically adjusted threshold and a filter for tolerating 
transient faults. Simulation results show that the scheme mitigates the negative influence of various 
types of faults by exploiting adaptation to temporal behavior of faults. Although we focused on faulty 
sensors, the scheme can be extended to cover faults in communication with minor modifications. Only 
three bits of information are exchanged each event detection cycle to reduce the communication cost. 
More extensive simulation is currently being conducted to estimate how the scheme performs for various 
event region shapes. 